In this analysis, we tested whether intraspecific variability (IV) in observed individual tree growth can emerge from the environment alone. To this end, we used a clonal experimental setup: the EUCFLUX common garden, a clonal trial where 14 Eucalyptus genotypes were grown in a replicated, statistically sound design. One of its main goals is to determine and compare the productivity of each genotype. Our hypothesis was that IV in tree growth mainly results from responses to environmental factors, and not only from intrinsic genetic factors. We therefore used the EUCFLUX dataset to quantify IV in growth within genotypes, i.e. in a dataset where genetically driven IV is nil. Following our hypothesis, we expected to detect IV in growth within single genotypes.

1 The dataset

The EUCFLUX experiment is located in Brazil, in the state of São Paulo. It includes 14 genotypes of five different Eucalyptus species or hybrids. Each genotype is planted in plots of 100 trees, at a density of 1666 trees per hectare, and replicated spatially in ten blocks. The experimental set-up was designed to minimize the variation in environmental factors among blocks, which were separated by less than 1.5 km within a homogeneous 200-ha stand showing small variation in soil properties. Tree DBH (Diameter at Breast Height) was measured over five complete censuses spanning six years, the age at which such plantations are generally harvested. We used these measurements to compute mean annual growth in mm/year at five points in time. The mean annual growth of each tree was computed as the DBH difference between two consecutive censuses divided by the time between them. When a tree died between two censuses, the corresponding data were discarded. We then computed the natural logarithm of diameter and of growth (adding a constant to growth in order to avoid undefined values).
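The growth computation described above can be illustrated with a minimal Python sketch. It is not the code used for the analysis (which was written in R). The function names, the DBH unit (mm), the time unit (years) and the example values are assumptions made for illustration; the constant of 1 follows the \(ln(G_{it}+1)\) term of the growth model.

```python
import math


def mean_annual_growth(dbh_prev_mm, dbh_next_mm, years_between):
    """Mean annual DBH growth (mm/year) between two consecutive censuses.

    Returns None when the tree died before the second census
    (dbh_next_mm is None), mirroring the rule that such data are discarded.
    """
    if dbh_next_mm is None:
        return None
    return (dbh_next_mm - dbh_prev_mm) / years_between


def log_growth(growth_mm_per_year, constant=1.0):
    """Natural logarithm of growth after adding a constant.

    The constant keeps the logarithm defined when growth is zero.
    """
    return math.log(growth_mm_per_year + constant)


# Hypothetical tree: 82.0 mm, then 97.0 mm, with censuses 1.25 years apart.
growth = mean_annual_growth(82.0, 97.0, 1.25)
print(growth)               # 12.0 mm/year
print(log_growth(growth))   # ln(12 + 1)
```

The log transform is only defined for growth values greater than minus the constant, which is consistent with negative growth values being removed before the transformation.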

The dataset included 64125 growth estimates corresponding to 13531 trees.

Figure S1.1: The original data without negative growth values (a) and the log-transformed growth (b).

Figure S1.1 shows the distribution of growth after removing negative values, and the distribution of the same values after log-transformation.

Figure S1.2: Design of a plot. Each point is a tree and the associated number is the tag of the tree.

Figure S1.2 shows the arrangement of the trees in a single plot. With 14 genotypes and 10 replicates, there are 140 plots sharing this design.

2 Competition index

Figure S2.1: Plot of the growth versus the diameter. Each colour represents a tree age.

Figure S2.1 shows that tree age strongly influences growth values and also the relationship between growth and diameter: the slope decreases with time, indicating that, for the same diameter, growth becomes slower as the stand ages. This is likely an effect of competition for light and possibly underground resources, since the capacity of trees to capture resources increases as they grow. We therefore computed a competition index \(C_{i,t}\) to integrate this effect in the growth model. The index was computed for each tree that was not on the edge of a plot, as the sum of the basal areas (BA) of its 8 direct neighbours. There was no need to divide by the area of the rectangle that comprises all the direct neighbours, since this area is constant by construction of the experimental design. Dead neighbours were considered as having a BA of zero. The index was then log-transformed.

\(C_{i,t} = \sum_{j \in neighbours(i)} BA_{j,t}\)
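As an illustration of this index, here is a minimal Python sketch, not the authors' implementation. It assumes that a plot is stored as a rectangular grid (a list of rows) of DBH values in cm, that dead trees are coded as None, and that basal area is the area of a circle of diameter DBH. The function names and the example grid are hypothetical.

```python
import math


def basal_area(dbh_cm):
    """Basal area (cm^2) of a stem; dead trees (None) contribute zero."""
    if dbh_cm is None:
        return 0.0
    return math.pi * (dbh_cm / 2.0) ** 2


def competition_index(dbh_grid, row, col):
    """Sum of the basal areas of the 8 direct neighbours of tree (row, col).

    Returns None for trees on the edge of the plot, for which the index
    is not computed.
    """
    n_rows, n_cols = len(dbh_grid), len(dbh_grid[0])
    if row in (0, n_rows - 1) or col in (0, n_cols - 1):
        return None
    total = 0.0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if (d_row, d_col) != (0, 0):  # skip the focal tree itself
                total += basal_area(dbh_grid[row + d_row][col + d_col])
    return total


# Hypothetical 3 x 3 neighbourhood: seven live neighbours of 10 cm, one dead.
grid = [
    [10.0, 10.0, 10.0],
    [10.0, 12.0, None],
    [10.0, 10.0, 10.0],
]
c = competition_index(grid, 1, 1)
print(c)            # sum of seven basal areas of pi * 5^2 cm^2 each
print(math.log(c))  # log-transformed index, as entered in the model
```

The logarithm is undefined if all eight neighbours are dead; the sketch does not handle that case.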

3 Statistical growth model

In order to partition the variance of individual growth data, we built a model with fixed effects for the intercept (\(\beta_0\)), the slope of diameter D (\(\beta_1\)), and the slope of the competition index C (\(\beta_2\)), together with several random effects on the intercept, namely temporal (date of census, \(b_t\)), individual (tree identifier, \(b_i\)), spatial (block, \(b_b\)), and genotype (\(b_g\)).

This model was fitted in a hierarchical Bayesian framework using the brms package (Bürkner 2017, Bürkner 2018) with 4 MCMC chains started from different initial values, each run for 10,000 iterations with a warm-up period of 5,000 iterations and a thinning rate of 1/5. We thus obtained 1,000 estimates per parameter per MCMC chain.
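The number of retained estimates follows directly from these sampler settings. The short Python sketch below only reproduces the arithmetic; the function name is hypothetical and it does not call brms or Stan.

```python
def retained_draws(iterations, warmup, thin):
    """Number of post-warm-up draws kept per chain after thinning."""
    return (iterations - warmup) // thin


per_chain = retained_draws(iterations=10_000, warmup=5_000, thin=5)
total = per_chain * 4  # 4 MCMC chains
print(per_chain, total)  # 1000 4000
```

The total of 4,000 draws per parameter is consistent with the effective sample sizes reported in Table S4.2, which are all below 4,000.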

Variables were scaled before inference.

\(ln(G_{it}+1) = (\beta_0 + b_i + b_b + b_g + b_t) + \beta_1 \times ln(D_{it}) + \beta_2 \times ln(C_{it}) + \epsilon_{it}\)
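To make the structure of this equation concrete, the following Python sketch evaluates its deterministic part (everything except \(\epsilon_{it}\)) for one observation. It is an illustration only: the function and argument names are hypothetical, the coefficient values are the rounded posterior means from Table S4.1, and the inputs are assumed to be on the scaled log scale used for inference.

```python
def expected_log_growth(log_d, log_c, beta0, beta1, beta2,
                        b_i=0.0, b_b=0.0, b_g=0.0, b_t=0.0):
    """Deterministic part of the growth model for one tree at one census.

    The four random effects (individual, block, genotype, census date)
    all shift the intercept; diameter and competition enter through
    their slopes.
    """
    intercept = beta0 + b_i + b_b + b_g + b_t
    return intercept + beta1 * log_d + beta2 * log_c


# Rounded posterior means from Table S4.1, random effects set to zero.
# Hypothetical tree one standard deviation above the mean for both
# scaled log-diameter and scaled log-competition.
mu = expected_log_growth(log_d=1.0, log_c=1.0,
                         beta0=-0.035, beta1=0.55, beta2=-0.27)
print(round(mu, 3))
```

With these values the positive diameter slope outweighs the negative competition slope, so the expected scaled log-growth is slightly above the intercept.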

Priors

\(\beta_0 \sim \mathcal{N}(mean=0, var = 1), iid\)

\(\beta_1 \sim \mathcal{N}(mean=0, var = 1), iid\)

\(\beta_2 \sim \mathcal{N}(mean=0, var = 1), iid\)

\(b_i \sim \mathcal{N}(mean=0, var=V_i), iid\)

\(b_b \sim \mathcal{N}(mean=0, var=V_b), iid\)

\(b_g \sim \mathcal{N}(mean=0, var=V_g), iid\)

\(b_t \sim \mathcal{N}(mean=0, var=V_t), iid\)

\(\epsilon_{it} \sim \mathcal{N}(mean=0, var=V), iid\)

Hyperpriors

\(V_i \sim t(3, 0, 2.5), iid\)

4 Results of the model and variance partitioning

After convergence of the model, we examined the proportion of the residual variance (the variation of the response variable that is not explained by the covariates, i.e. the unexplained part of the statistical model) related to each random effect. This enabled us to perform a residual variance partitioning.

Figure S4.1: Trace of the posteriors of the inferred parameters

Figure S4.2: Density of the posteriors of the inferred parameters

Figure S4.3: Trace of the temporal random effects

Figure S4.4: Trace of the genotype random effects

Figure S4.5: Trace of the spatial (block) random effects.

Figure S4.6: Mean values and 95% confidence interval of the temporal, genetic and spatial random effects.

Table S4.1: Mean posteriors of the model and their estimation errors.

| | Intercept (\(\beta_0\)) | Diameter (\(\beta_1\)) | Competition (\(\beta_2\)) | Individual variance (\(V_i\)) | Block variance (\(V_b\)) | Genetic variance (\(V_g\)) | Temporal variance (\(V_t\)) | Residual variance (\(V\)) |
|---|---|---|---|---|---|---|---|---|
| Estimate | -3.5e-02 | 5.5e-01 | -2.7e-01 | 2.3e-01 | 6e-02 | 1.3e-01 | 1.3e+00 | 5.1e-01 |
| Estimation error | 5e-01 | 5e-03 | 8.9e-03 | 4e-03 | 1.8e-02 | 3.1e-02 | 5.4e-01 | 2e-03 |

Table S4.2: Summary of the model’s outputs

| | Intercept (\(\beta_0\)) | Diameter (\(\beta_1\)) | Competition (\(\beta_2\)) | Individual variance (\(V_i\)) | Block variance (\(V_b\)) | Genetic variance (\(V_g\)) | Temporal variance (\(V_t\)) | Residual variance (\(V\)) |
|---|---|---|---|---|---|---|---|---|
| Estimate | -3.5e-02 | 5.5e-01 | -2.7e-01 | 2.3e-01 | 6e-02 | 1.3e-01 | 1.3e+00 | 5.1e-01 |
| Estimation error | 5e-01 | 5e-03 | 8.9e-03 | 4e-03 | 1.8e-02 | 3.1e-02 | 5.4e-01 | 2e-03 |
| 95% interval | -1e+00 to 9.8e-01 | 5.4e-01 to 5.6e-01 | -2.8e-01 to -2.5e-01 | 2.2e-01 to 2.3e-01 | 3.5e-02 to 1e-01 | 8.9e-02 to 2.1e-01 | 6.2e-01 to 2.6e+00 | 5.1e-01 to 5.2e-01 |
| R-hat | 1e+00 | 1e+00 | 1e+00 | 1e+00 | 1e+00 | 1e+00 | 1e+00 | 1e+00 |
| Bulk ESS | 3.3e+03 | 3.8e+03 | 3.8e+03 | 3.6e+03 | 3.3e+03 | 3.1e+03 | 3.4e+03 | 3.9e+03 |
| Tail ESS | 3.7e+03 | 3.9e+03 | 3.6e+03 | 3.5e+03 | 3.6e+03 | 3.7e+03 | 3.6e+03 | 3.8e+03 |

We found that, among the random effects, the two largest contributors to variance were the census date and individual identity. Note the high estimation error for the intercept and for the temporal random effect. The proportions of variance represented by each random effect and by the residual variance were computed to visualise the variance partition.
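The partition itself is a simple normalisation: each variance component is divided by the sum of all components. The Python sketch below illustrates this with the rounded posterior means of Table S4.1, read at face value as variances. This is an assumption made for illustration, and the function and key names are hypothetical, so the resulting shares may differ from those shown in Figure S4.7.

```python
def variance_proportions(components):
    """Share of the total unexplained variance held by each component."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Rounded posterior means from Table S4.1.
table_s4_1 = {
    "individual": 0.23,
    "block": 0.06,
    "genotype": 0.13,
    "temporal": 1.3,
    "residual": 0.51,
}

shares = variance_proportions(table_s4_1)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.1%}")
```

Under this reading, the temporal component accounts for a little under 60% of the unexplained variance, and the individual component is larger than the genotype and block components.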

Figure S4.7: Proportion of each variance component of the unexplained variance.

The model showed that individual tree growth was a function of tree size and competition with neighbouring trees, and that the variance around this model was mostly due to a temporal effect and, to a lesser extent, an individual effect (Table S4.1, Figure S4.7). The effect of the genotype was quite small, and the effect of the block was even smaller. The temporal random effects declined with time (Figure S4.6, panel a), showing that the effect of the date on growth is negative (the older the trees become, the less they grow). We attribute this tendency to competition, which suggests that the competition index C did not fully capture the effect of competition on growth. Another explanation is that the diameter slope did not fully capture the decrease of tree growth with size, which may result from the geometrical effect of distributing growth around an increasing diameter or from physiological constraints linked to height. The temporal effect explained the highest fraction of variance (Table S4.1, Figure S2.1). This could be due to the negative effect on growth of competition for light, water, and/or nutrients, which intensifies as trees planted at high density grow, and to physiological changes occurring with age.

Importantly, variability between individuals within genotypes was higher than variability between genotypes. This shows that individual variability is observed within genotypes even though the trees are clones. In this particular case, the estimated individual variability within genotypes can only be caused by exogenous factors. Individual effects can be due to the micro-environment in which each tree grows, but also to individual history, such as seedling handling and planting.

The block had the smallest impact, probably because environmental conditions are quite homogeneous between blocks. As the experimental design aimed at minimizing environmental variation and the selected genotypes were productive ones able to accommodate several environmental conditions (Maire et al. 2019), this dataset is a strongly conservative test case for our hypothesis.

5 Conclusion

Overall, we found IV within clonal tree plantations (Figure S5.1). This shows that IV can arise solely from environmental factors varying between individuals, and therefore that the source of IV is not necessarily intrinsic.

Figure S5.1: The EUCFLUX setup and the variance partitioning. Site (a; each square represents a block) and organisation of a block (b; each coloured square represents a genotype) in the EUCFLUX experiment. 16 clones are represented in (b), but only 14 were used since the last two are of seed origin and thus not genetically identical. (c) Variance partitioning of tree growth for a common garden experiment with various Eucalyptus clones.

6 Code implementation

The whole analysis was conducted using the R language (R Core Team 2021) in the RStudio environment (RStudio Team 2021). The tables were made with the kableExtra package (Zhu 2021) and the figures with the ggplot2 package (Wickham 2009). The code also uses other packages of the Tidyverse (Wickham et al. 2019), namely dplyr (Wickham et al. 2021), lubridate (Grolemund and Wickham 2011) and magrittr (Bache and Wickham 2020), as well as the R packages here (Müller 2020) and bayesplot (Gabry and Mahr 2021). The pdf and html documents were produced with the R packages rmarkdown (Xie et al. 2020), knitr (Xie 2014) and bookdown (Xie 2017).

References

Allaire, J., Y. Xie, J. McPherson, J. Luraschi, K. Ushey, A. Atkins, H. Wickham, J. Cheng, W. Chang, and R. Iannone. 2020. Rmarkdown: Dynamic documents for R.
Bache, S. M., and H. Wickham. 2020. Magrittr: A forward-pipe operator for R.
Bürkner, P.-C. 2017. brms: An R Package for Bayesian Multilevel Models Using Stan. Journal of Statistical Software 80.
Bürkner, P.-C. 2018. Advanced Bayesian Multilevel Modeling with the R Package brms. The R Journal 10:395.
Gabry, J., and T. Mahr. 2021. Bayesplot: Plotting for bayesian models.
Grolemund, G., and H. Wickham. 2011. Dates and Times Made Easy with lubridate. Journal of Statistical Software 40.
Maire, G. le, J. Guillemot, O. C. Campoe, J.-L. Stape, J.-P. Laclau, and Y. Nouvellon. 2019. Light absorption, light use efficiency and productivity of 16 contrasted genotypes of several Eucalyptus species along a 6-year rotation in Brazil. Forest Ecology and Management 449:117443.
Müller, K. 2020. Here: A simpler way to find your files.
R Core Team. 2021. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
RStudio Team. 2021. RStudio: Integrated development environment for R. RStudio, PBC, Boston, MA.
Wickham, H. 2009. Ggplot2: Elegant graphics for data analysis. Springer, New York.
Wickham, H., M. Averick, J. Bryan, W. Chang, L. McGowan, R. François, G. Grolemund, A. Hayes, L. Henry, J. Hester, M. Kuhn, T. Pedersen, E. Miller, S. Bache, K. Müller, J. Ooms, D. Robinson, D. Seidel, V. Spinu, K. Takahashi, D. Vaughan, C. Wilke, K. Woo, and H. Yutani. 2019. Welcome to the Tidyverse. Journal of Open Source Software 4:1686.
Wickham, H., R. François, L. Henry, and K. Müller. 2021. Dplyr: A grammar of data manipulation.
Xie, Y. 2014. Knitr: A Comprehensive Tool for Reproducible Research in R. in V. Stodden, F. Leisch, and R. D. Peng, editors. Implementing reproducible research. CRC Press, Taylor & Francis Group, Boca Raton.
Xie, Y. 2015. Dynamic documents with R and Knitr. Second edition. CRC Press/Taylor & Francis, Boca Raton.
Xie, Y. 2017. Bookdown: Authoring books and technical publications with R Markdown. CRC Press, Boca Raton, FL.
Xie, Y. 2021. Knitr: A general-purpose package for dynamic report generation in R.
Xie, Y., J. J. Allaire, and G. Grolemund. 2019. R Markdown: The definitive guide. CRC Press, Taylor & Francis Group, Boca Raton.
Xie, Y., C. Dervieux, and E. Riederer. 2020. R markdown cookbook. First edition. Taylor & Francis, CRC Press, Boca Raton.
Zhu, H. 2021. kableExtra: Construct complex table with ’kable’ and pipe syntax.